[TKW] Add support for multiple/local reduceOp #234
Merged
To support flash attention, we need to be able to expand ReduceOps in the reduction dimension as well. We will do this by expanding the source of the ReduceOp and locally reducing all of the expanded sources. As a first step, this PR (1st of 2) adds support for locally reducing over multiple variables.
The second PR, on the way, will add the expansion of ReduceOp itself.
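For readers unfamiliar with the idea, here is a minimal sketch of what "locally reducing over multiple variables" means. It is plain Python and does not use the TKW API: `local_reduce`, `final_reduce`, and the sample data are hypothetical names chosen for illustration only. Several expanded sources are first combined elementwise into one partial result, and only that single partial result goes through the remaining reduction.

```python
from functools import reduce
from typing import Callable, List, Sequence

# Hypothetical helper, not part of TKW: combine several expanded sources
# elementwise, so that only one partial result is left to reduce afterwards.
def local_reduce(
    sources: Sequence[Sequence[float]],
    binary_fn: Callable[[float, float], float],
) -> List[float]:
    return [reduce(binary_fn, values) for values in zip(*sources)]

# Hypothetical helper, not part of TKW: stands in for the remaining
# reduction that runs over the single partial result.
def final_reduce(
    partial: Sequence[float],
    binary_fn: Callable[[float, float], float],
) -> float:
    return reduce(binary_fn, partial)

# Three "expanded" sources along the reduction dimension (made-up values).
sources = [[1.0, 5.0], [7.0, 2.0], [3.0, 4.0]]

partial = local_reduce(sources, max)  # elementwise max across the sources
result = final_reduce(partial, max)   # max over the partial result
```

Doing the elementwise step first means the more expensive final reduction runs once, regardless of how many sources the expansion produced.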
In this PR we are contributing two things: